1.
Front Plant Sci ; 15: 1339132, 2024.
Article in English | MEDLINE | ID: mdl-38357267

ABSTRACT

Metabolic pathway drift has been formulated as a general principle to help in the interpretation of comparative analyses between biosynthesis pathways. Indeed, such analyses often indicate substantial differences, even in widespread pathways that are sometimes believed to be conserved. Here, our purpose is to check how well this interpretation fits empirical data gathered in the field of plant and algal biosynthesis pathways. After examining several examples representative of the diversity of lipid biosynthesis pathways, we explain why it is important to compare closely related species to gain a better understanding of this phenomenon. Furthermore, this comparative approach raises the question of how far biotic interactions are responsible for shaping this metabolic plasticity. We conclude by introducing some model systems that may be promising for further exploration of this question.

2.
PLoS Comput Biol ; 19(8): e1011404, 2023 08.
Article in English | MEDLINE | ID: mdl-37651409

ABSTRACT

Numerous computational methods based on sequences or structures have been developed for the characterization of protein function, but they are still unsatisfactory to deal with the multiple functions of multi-domain protein families. Here we propose an original approach based on 1) the detection of conserved sequence modules using partial local multiple alignment, 2) the phylogenetic inference of species/genes/modules/functions evolutionary histories, and 3) the identification of co-appearances of modules and functions. Applying our framework to the multidomain ADAMTS-TSL family including ADAMTS (A Disintegrin-like and Metalloproteinase with ThromboSpondin motif) and ADAMTS-like proteins over nine species including human, we identify 45 sequence module signatures that are associated with the occurrence of 278 Protein-Protein Interactions in ancestral genes. Some of these signatures are supported by published experimental data and the others provide new insights (e.g. ADAMTS-5). The module signatures of ADAMTS ancestors notably highlight the dual variability of the propeptide and ancillary regions suggesting the importance of these two regions in the specialization of ADAMTS during evolution. Our analyses further indicate convergent interactions of ADAMTS with COMP and CCN2 proteins. Overall, our study provides 186 sequence module signatures that discriminate distinct subgroups of ADAMTS and ADAMTSL and that may result from selective pressures on novel functions and phenotypes.


Subjects
Gene Regulatory Networks , Humans , Phylogeny , Conserved Sequence , Phenotype
3.
Genome Res ; 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37468308

ABSTRACT

Comparative analysis of genome-scale metabolic networks (GSMNs) may yield important information on the biology, evolution, and adaptation of species. However, it is impeded by the high heterogeneity of the quality and completeness of structural and functional genome annotations, which may bias the results of such comparisons. To address this issue, we developed AuCoMe, a pipeline to automatically reconstruct homogeneous GSMNs from a heterogeneous set of annotated genomes without discarding available manual annotations. We tested AuCoMe with three data sets, one bacterial, one fungal, and one algal, and showed that it successfully reduces technical biases while capturing the metabolic specificities of each organism. Our results also point out shared and divergent metabolic traits among evolutionarily distant algae, underlining the potential of AuCoMe to accelerate the broad exploration of metabolic evolution across the tree of life.
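
As a rough illustration of what comparing GSMNs can involve, the sketch below treats each network as a set of reaction identifiers and computes pairwise Jaccard similarity, the shared core, and the reactions unique to each organism. This is not AuCoMe's pipeline; the function names, the organism labels, and the reaction identifiers (`R1`, `R2`, ...) are hypothetical and chosen only for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two reaction sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_jaccard(networks):
    """Map each unordered pair of organism names to the similarity of their reaction sets."""
    return {
        (x, y): jaccard(networks[x], networks[y])
        for x, y in combinations(sorted(networks), 2)
    }


def core_and_specific(networks):
    """Return (reactions shared by all organisms, reactions unique to each organism)."""
    core = set.intersection(*networks.values())
    specific = {}
    for name, reactions in networks.items():
        others = set().union(*(r for n, r in networks.items() if n != name))
        specific[name] = reactions - others
    return core, specific


# Hypothetical toy networks: organism name -> set of reaction identifiers.
networks = {
    "alga_A": {"R1", "R2", "R3"},
    "alga_B": {"R2", "R3", "R4"},
    "alga_C": {"R3", "R5"},
}
similarities = pairwise_jaccard(networks)
core, specific = core_and_specific(networks)
```

A set-based comparison like this is only meaningful when the networks were reconstructed with comparable annotation quality, which is the heterogeneity problem the abstract describes.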

4.
BMC Genomics ; 23(1): 216, 2022 Mar 18.
Article in English | MEDLINE | ID: mdl-35303798

ABSTRACT

BACKGROUND: In eukaryote transcriptomes, a significant amount of transcript diversity comes from genes' capacity to generate different transcripts through alternative splicing. Identifying orthologous alternative transcripts across multiple species is of particular interest for genome annotators. However, there is no formal definition of transcript orthology based on the splicing structure conservation. Likewise there is no public dataset benchmark providing groups of orthologous transcripts sharing a conserved splicing structure. RESULTS: We introduced a formal definition of splicing structure orthology and we predicted transcript orthologs in human, mouse and dog. Applying a selective strategy, we analyzed 2,167 genes and their 18,109 known transcripts and identified a set of 253 gene orthologs that shared a conserved splicing structure in all three species. We predicted 6,861 transcript CDSs (coding sequence), mainly for dog, an emergent model species. Each predicted transcript was an ortholog of a known transcript: both share the same CDS splicing structure. Evidence for the existence of the predicted CDSs was found in external data. CONCLUSIONS: We generated a dataset of 253 gene triplets, structurally conserved and sharing all their CDSs in human, mouse and dog, which correspond to 879 triplets of spliced CDS orthologs. We have released the dataset both as an SQL database and as tabulated files. The data consists of the 879 CDS orthology groups with their detailed splicing structures, and the predicted CDSs, associated with their experimental evidence. The 6,861 predicted CDSs are provided in GTF files. Our data may contribute to compare highly conserved genes across three species, for comparative transcriptomics at the isoform level, or for benchmarking splice aligners and methods focusing on the identification of splicing orthologs. The data is available at https://data-access.cesgo.org/index.php/s/V97GXxOS66NqTkZ .
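
Since the predicted CDSs are distributed as GTF files, a minimal sketch of how one might read CDS records and summarize each transcript's segment layout is given here. It relies only on the standard nine-column, tab-separated GTF layout with a `transcript_id` attribute; the helper names and the example records are hypothetical, and comparing segment lengths is a crude proxy, not the formal definition of splicing structure orthology introduced in the paper.

```python
import re
from collections import defaultdict

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def cds_structures(gtf_lines):
    """Map transcript_id -> sorted list of (start, end) CDS segments."""
    segments = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "CDS":
            continue  # keep only well-formed CDS records
        match = _TRANSCRIPT_ID.search(fields[8])
        if match:
            segments[match.group(1)].append((int(fields[3]), int(fields[4])))
    return {tid: sorted(segs) for tid, segs in segments.items()}


def segment_lengths(structure):
    """Lengths of CDS segments; GTF coordinates are 1-based and inclusive."""
    return [end - start + 1 for start, end in structure]


def same_cds_layout(a, b):
    """Crude proxy: same number of CDS segments with identical lengths."""
    return segment_lengths(a) == segment_lengths(b)


# Hypothetical records, for illustration only.
EXAMPLE_GTF = [
    'chr1\tsrc\texon\t90\t199\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tCDS\t100\t199\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tCDS\t300\t359\t.\t+\t2\tgene_id "g1"; transcript_id "t1";',
]
structures = cds_structures(EXAMPLE_GTF)
```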


Subjects
Genome , RNA Splicing , Alternative Splicing , Animals , Dogs , Exons , Humans , Mice , Protein Isoforms/metabolism
5.
Mol Biol Evol ; 38(9): 3754-3774, 2021 08 23.
Article in English | MEDLINE | ID: mdl-33974066

ABSTRACT

Extreme halophilic Archaea thrive in high salt, where, through proteomic adaptation, they cope with the strong osmolarity and extreme ionic conditions of their environment. In spite of wide fundamental interest, however, studies providing insights into this adaptation are scarce, because of practical difficulties inherent to the purification and characterization of halophilic enzymes. In this work, we describe the evolutionary history of malate dehydrogenases (MalDH) within Halobacteria (a class of the Euryarchaeota phylum). We resurrected nine ancestors along the inferred halobacterial MalDH phylogeny, including the Last Common Ancestral MalDH of Halobacteria (LCAHa) and compared their biochemical properties with those of five modern halobacterial MalDHs. We monitored the stability of these various MalDHs, their oligomeric states and enzymatic properties, as a function of concentration for different salts in the solvent. We found that a variety of evolutionary processes, such as amino acid replacement, gene duplication, loss of MalDH gene and replacement owing to horizontal transfer resulted in significant differences in solubility, stability and catalytic properties between these enzymes in the three Halobacteriales, Haloferacales, and Natrialbales orders since the LCAHa MalDH. We also showed how a stability trade-off might favor the emergence of new properties during adaptation to diverse environmental conditions. Altogether, our results suggest a new view of halophilic protein adaptation in Archaea.


Subjects
Euryarchaeota , Halobacterium , Malates , Phylogeny , Proteomics
6.
Mol Phylogenet Evol ; 136: 104-118, 2019 07.
Article in English | MEDLINE | ID: mdl-30980935

ABSTRACT

Genes showing versatile functions or subjected to fast expansion and contraction during the adaptation of species to specific ecological conditions, like sensory receptors for odors, pheromones and tastes, are characterized by a great plasticity through evolution. One of the most fascinating sensory receptors in the family of TRP channels, the cold and menthol receptor TRPM8, has received significant attention in the literature. Recent studies have reported the existence of TRPM8 channel isoforms encoded by alternative mRNAs transcribed from alternative promoters and processed by alternative splicing. Since the first draft of the human genome was accomplished in 2000, alternative transcription, alternative splicing and alternative translation have appeared as major sources of gene product diversity and are thought to participate in the generation of complexity in higher organisms. In this study, we investigate whether alternative transcription has been a driving force in the evolution of the human forms of the cold receptor TRPM8. We identified 33 TRPM8 alternative mRNAs (24 new sequences) and their associated protein isoforms in human tissues. Using comparative genomics, we described the evolution of the human TRPM8 sequences in eight ancestors since the origin of Amniota, and estimated in which ancestors the new TRPM8 variants originated. In order to validate the estimated origins of this receptor, we performed experimental validations of predicted exons in mouse tissues. Our results suggest a first diversification event of the cold receptor in the Boreoeutheria ancestor, and a subsequent divergence at the origin of Simiiformes.


Subjects
Cold Temperature , Molecular Evolution , Menthol/metabolism , TRPM Cation Channels/genetics , Alternative Splicing/genetics , Animals , Tumor Cell Line , Exons/genetics , Genetic Variation , HEK293 Cells , Humans , Mice , Open Reading Frames/genetics , Phylogeny , Protein Isoforms/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , TRPM Cation Channels/metabolism
7.
Biol Chem ; 400(3): 367-381, 2019 02 25.
Article in English | MEDLINE | ID: mdl-30763032

ABSTRACT

For evolutionary studies, but also for protein engineering, ancestral sequence reconstruction (ASR) has become an indispensable tool. The first step of every ASR protocol is the preparation of a representative sequence set containing at most a few hundred recent homologs whose composition determines decisively the outcome of a reconstruction. A common approach for sequence selection consists of several rounds of manual recompilation that is driven by embedded phylogenetic analyses of the varied sequence sets. For ASR of a geranylgeranylglyceryl phosphate synthase, we additionally utilized FitSS4ASR, which replaces this time-consuming protocol with an efficient and more rational approach. FitSS4ASR applies orthogonal filters to a set of homologs to eliminate outlier sequences and those bearing only a weak phylogenetic signal. To demonstrate the usefulness of FitSS4ASR, we determined experimentally the oligomerization state of eight predecessors, which is a delicate and taxon-specific property. Corresponding ancestors deduced in a manual approach and by means of FitSS4ASR had the same dimeric or hexameric conformation; this concordance testifies to the efficiency of FitSS4ASR for sequence selection. FitSS4ASR-based results of two other ASR experiments were added to the Supporting Information. Program and documentation are available at https://gitlab.bioinf.ur.de/hek61586/FitSS4ASR.


Subjects
Alkyl and Aryl Transferases/genetics , Software , Alkyl and Aryl Transferases/isolation & purification , Alkyl and Aryl Transferases/metabolism , Amino Acid Sequence , Molecular Cloning , Molecular Evolution , Phylogeny , Protein Engineering , Time Factors
8.
Bioinformatics ; 34(4): 585-591, 2018 02 15.
Article in English | MEDLINE | ID: mdl-29040406

ABSTRACT

Motivation: Advances in the sequencing of uncultured environmental samples, dubbed metagenomics, raise a growing need for accurate taxonomic assignment. Accurate identification of organisms present within a community is essential to understanding even the most elementary ecosystems. However, current high-throughput sequencing technologies generate short reads which partially cover full-length marker genes and this poses difficult bioinformatic challenges for taxonomy identification at high resolution. Results: We designed MATAM, a software dedicated to the fast and accurate targeted assembly of short reads sequenced from a genomic marker of interest. The method implements a stepwise process based on construction and analysis of a read overlap graph. It is applied to the assembly of 16S rRNA markers and is validated on simulated, synthetic and genuine metagenomes. We show that MATAM outperforms other available methods in terms of low error rates and recovered fractions and is suitable to provide improved assemblies for precise taxonomic assignments. Availability and implementation: https://github.com/bonsai-team/matam. Contact: pierre.pericard@gmail.com or helene.touzet@univ-lille1.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
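
To make the notion of a read overlap graph concrete, the following sketch links two reads whenever a suffix of one exactly matches a prefix of the other over at least `min_len` bases. This is a deliberately simplified stand-in and not MATAM's implementation, which is not described in this record; the function names, the `min_len` parameter, and the toy reads are assumptions made for the example.

```python
def overlap_length(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`, or 0 if shorter than `min_len`."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Directed graph as a dict: (i, j) -> overlap length between reads[i] and reads[j]."""
    graph = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i == j:
                continue  # ignore self-overlaps
            k = overlap_length(a, b, min_len)
            if k:
                graph[(i, j)] = k
    return graph


# Hypothetical toy reads, for illustration only.
reads = ["ACGTAC", "GTACGG", "ACGGTT"]
graph = build_overlap_graph(reads, min_len=3)
```

The all-against-all comparison is quadratic in the number of reads, so it only suits toy inputs; real tools index the reads and tolerate sequencing errors in the overlaps.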


Subjects
Gastrointestinal Microbiome/genetics , High-Throughput Nucleotide Sequencing/methods , Metagenome , Phylogeny , Software , Algorithms , Humans , Metagenomics/methods , 16S Ribosomal RNA/genetics , DNA Sequence Analysis/methods
9.
BMC Genomics ; 17(Suppl 10): 786, 2016 11 11.
Article in English | MEDLINE | ID: mdl-28185551

ABSTRACT

BACKGROUND: Transcriptome reconstruction, defined as the identification of all protein isoforms that may be expressed by a gene, is a notably difficult computational task. With real data, the best methods based on RNA-seq data identify barely 21 % of the expressed transcripts. While waiting for algorithms and sequencing techniques to improve - as has been strongly suggested in the literature - it is important to evaluate assisted transcriptome prediction; this is the question of how alternative transcription in one species performs as a predictor of protein isoforms in another relatively close species. Most evidence-based gene predictors use transcripts from other species to annotate a genome, but the predictive power of procedures that use exclusively transcripts from external species has never been quantified. The cornerstone of such an evaluation is the correct identification of pairs of transcripts with the same splicing patterns, called splicing orthologs. RESULTS: We propose a rigorous procedural definition of splicing orthologs, based on the identification of all ortholog pairs of splicing sites in the nucleotide sequences, and alignments at the protein level. Using our definition, we compared 24 382 human transcripts and 17 909 mouse transcripts from the highly curated CCDS database, and identified 11 122 splicing orthologs. In prediction mode, we show that human transcripts can be used to infer over 62 % of mouse protein isoforms. When restricting the predictions to transcripts known eight years ago, the percentage grows to 74 %. Using CCDS timestamped releases, we also analyze the evolution of the number of splicing orthologs over the last decade. CONCLUSIONS: Alternative splicing is now recognized to play a major role in the protein diversity of eukaryotic organisms, but definitions of spliced isoform orthologs are still approximate. Here we propose a definition adapted to the subtle variations of conserved alternative splicing sites, and use it to validate numerous accurate orthologous isoform predictions.


Subjects
Algorithms , Proteins/genetics , Transcriptome , Alternative Splicing , Animals , Computational Biology , Exons , Humans , Mice , Protein Isoforms/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism
10.
BMC Genomics ; 16 Suppl 5: S6, 2015.
Article in English | MEDLINE | ID: mdl-26040958

ABSTRACT

BACKGROUND: In the context of ancestral gene order reconstruction from extant genomes, there exist two main computational approaches: rearrangement-based, and homology-based methods. The rearrangement-based methods consist in minimizing a total rearrangement distance on the branches of a species tree. The homology-based methods consist in the detection of a set of potential ancestral contiguity features, followed by the assembling of these features into Contiguous Ancestral Regions (CARs). RESULTS: In this paper, we present a new homology-based method that uses a progressive approach for both the detection and the assembling of ancestral contiguity features into CARs. The method is based on detecting a set of potential ancestral adjacencies iteratively using the current set of CARs at each step, and constructing CARs progressively using a 2-phase assembling method. CONCLUSION: We show the usefulness of the method through a reconstruction of the boreoeutherian ancestral gene order, and a comparison with three other homology-based methods: AnGeS, InferCARs and GapAdj. The program, written in Python, and the dataset used in this paper are available at http://bioinfo.lifl.fr/procars/.
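
The general idea of assembling ancestral adjacencies into CARs can be sketched with a generic greedy procedure: accept adjacencies in decreasing order of weight, and reject any that would give a gene more than two neighbours or close a cycle. This is not the progressive 2-phase method the abstract describes, gene orientation is ignored, and the function name, weights, and gene labels are hypothetical.

```python
def assemble_cars(adjacencies):
    """Greedily turn weighted adjacencies {(gene_a, gene_b): weight} into linear CARs."""
    parent = {}

    def find(x):
        # Union-find with path halving, used to detect cycles.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    neighbors = {}
    ranked = sorted(adjacencies.items(), key=lambda kv: (-kv[1], kv[0]))
    for (a, b), _weight in ranked:
        na = neighbors.setdefault(a, [])
        nb = neighbors.setdefault(b, [])
        if a == b or len(na) >= 2 or len(nb) >= 2:
            continue  # a gene has at most two neighbours in a linear CAR
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue  # accepting this adjacency would close a cycle
        parent[root_a] = root_b
        na.append(b)
        nb.append(a)

    # Walk each path from one of its endpoints; isolated genes form singleton CARs.
    cars, seen = [], set()
    for start in sorted(neighbors):
        if start in seen or len(neighbors[start]) == 2:
            continue
        path, prev, cur = [], None, start
        while cur is not None:
            path.append(cur)
            seen.add(cur)
            following = [n for n in neighbors[cur] if n != prev]
            prev, cur = cur, (following[0] if following else None)
        cars.append(path)
    return cars


# Hypothetical weighted adjacencies, for illustration only.
adjacencies = {
    ("g1", "g2"): 0.9,
    ("g2", "g3"): 0.8,
    ("g4", "g5"): 0.7,
    ("g1", "g3"): 0.6,  # rejected: would close a cycle g1-g2-g3
    ("g2", "g4"): 0.5,  # rejected: g2 already has two neighbours
}
cars = assemble_cars(adjacencies)
```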


Subjects
Animal Population Groups/genetics , Computational Biology/methods , Genome/genetics , Genomics/methods , Population Groups/genetics , Algorithms , Animals , Molecular Evolution , Humans , Genetic Models , Phylogeny
11.
J Am Chem Soc ; 136(1): 122-9, 2014 Jan 08.
Article in English | MEDLINE | ID: mdl-24364418

ABSTRACT

Due to the lack of macromolecular fossils, the enzymatic repertoire of extinct species has remained largely unknown to date. In an attempt to solve this problem, we have characterized a cyclase subunit (HisF) of the imidazole glycerol phosphate synthase (ImGP-S), which was reconstructed from the era of the last universal common ancestor of cellular organisms (LUCA). As observed for contemporary HisF proteins, the crystal structure of LUCA-HisF adopts the (βα)8-barrel architecture, one of the most ancient folds. Moreover, LUCA-HisF (i) resembles extant HisF proteins with regard to internal 2-fold symmetry, active site residues, and a stabilizing salt bridge cluster, (ii) is thermostable and shows a folding mechanism similar to that of contemporary (βα)8-barrel enzymes, (iii) displays high catalytic activity, and (iv) forms a stable and functional complex with the glutaminase subunit (HisH) of an extant ImGP-S. Furthermore, we show that LUCA-HisF binds to a reconstructed LUCA-HisH protein with high affinity. Our findings suggest that the evolution of highly efficient enzymes and enzyme complexes has already been completed in the LUCA era, which means that sophisticated catalytic concepts such as substrate channeling and allosteric communication existed already 3.5 billion years ago.


Subjects
Molecular Evolution , Multienzyme Complexes/chemistry , Multienzyme Complexes/metabolism , Aminohydrolases/chemistry , Aminohydrolases/genetics , Aminohydrolases/metabolism , Archaea/enzymology , Archaea/genetics , X-Ray Crystallography , Biological Extinction , Molecular Models , Protein Folding , Protein Secondary Structure
12.
Biol Lett ; 9(5): 20130608, 2013 Oct 23.
Article in English | MEDLINE | ID: mdl-24046876

ABSTRACT

Several lines of evidence such as the basal location of thermophilic lineages in large-scale phylogenetic trees and the ancestral sequence reconstruction of single enzymes or large protein concatenations support the conclusion that the ancestors of the bacterial and archaeal domains were thermophilic organisms which were adapted to hot environments during the early stages of the Earth. A parsimonious reasoning would therefore suggest that the last universal common ancestor (LUCA) was also thermophilic. Various authors have used branch-wise non-homogeneous evolutionary models that better capture the variation of molecular compositions among lineages to accurately reconstruct the ancestral G + C contents of ribosomal RNAs and the ancestral amino acid composition of highly conserved proteins. They confirmed the thermophilic nature of the ancestors of Bacteria and Archaea but concluded that LUCA, their last common ancestor, was a mesophilic organism having a moderate optimal growth temperature. In this letter, we investigate the unknown nature of the phylogenetic signal that informs ancestral sequence reconstruction to support this non-parsimonious scenario. We find that rate variation across sites of molecular sequences provides information at different time scales by recording the oldest adaptation to temperature in slow-evolving regions and subsequent adaptations in fast-evolving ones.


Subjects
Physiological Adaptation , Cold Temperature , Earth (Planet) , Life , Theoretical Models , Phylogeny
13.
BMC Evol Biol ; 11: 70, 2011 Mar 15.
Article in English | MEDLINE | ID: mdl-21406081

ABSTRACT

BACKGROUND: Plasmodium falciparum is responsible for the most acute form of human malaria. Most recent studies demonstrate that it belongs to a monophyletic lineage specialized in the infection of great ape hosts. Several other Plasmodium species cause human malaria. They all belong to another distinct lineage of parasites which infect a wider range of primate species. All known mammalian malaria parasites appear to be monophyletic. Their clade includes the two previous distinct lineages of parasites of primates and great apes, one lineage of rodent parasites, and presumably Hepatocystis species. Plasmodium falciparum and great ape parasites are commonly thought to be the sister-group of all other mammal-infecting malaria parasites. However, some studies supported contradictory origins and found parasites of great apes to be closer to those of rodents, or to those of other primates. RESULTS: To distinguish between these mutually exclusive hypotheses on the origin of Plasmodium falciparum and its great ape infecting relatives, we performed a comprehensive phylogenetic analysis based on a data set of three mitochondrial genes from 33 to 84 malaria parasites. We showed that malarial mitochondrial genes have evolved slowly and are compositionally homogeneous. We estimated their phylogenetic relationships using Bayesian and maximum-likelihood methods. Inferred trees were checked for their robustness to the (i) site selection, (ii) assumptions of various probabilistic models, and (iii) taxon sampling. Our results robustly support a common ancestry of rodent parasites and Plasmodium falciparum's relatives infecting great apes. CONCLUSIONS: Our results refute the most common view of the origin of great ape malaria parasites, and instead demonstrate the robustness of a less well-established phylogenetic hypothesis, under which Plasmodium falciparum and its relatives infecting great apes are closely related to rodent parasites. This study sheds light on the evolutionary history of Plasmodium falciparum, a major issue for human health.


Subjects
Molecular Evolution , Mitochondrial Genes , Hominidae/parasitology , Plasmodium falciparum/genetics , Animals , Bayes Theorem , Mitochondrial DNA/genetics , Protozoan DNA/genetics , Mitochondrial Genome , Likelihood Functions , Malaria/parasitology , Genetic Models , Phylogeny , Plasmodium falciparum/classification , Sequence Alignment , DNA Sequence Analysis
14.
J Mol Biol ; 398(5): 763-73, 2010 May 21.
Article in English | MEDLINE | ID: mdl-20363228

ABSTRACT

The evolution of the prototypical (beta alpha)(8)-barrel protein imidazole glycerol phosphate synthase (HisF) was studied by complementary computational and experimental approaches. The 4-fold symmetry of HisF suggested that its constituting (beta alpha)(2) quarter-barrels have a common evolutionary origin. This conclusion was supported by the computational reconstruction of the HisF sequence of the last common ancestor, which showed that its quarter-barrels were more similar to each other than are those of extant HisF proteins. A comprehensive sequence analysis identified HisF-N1 [corresponding to (beta alpha)(1-2)] as the slowest evolving quarter-barrel. This finding indicated that it is the closest relative of the common (beta alpha)(2) predecessor, which must have been a stable and presumably tetrameric protein. In accordance with this prediction, a recombinantly produced HisF-N1 protein was properly folded and formed a tetramer being stabilised by disulfide bonds. The introduction of a disulfide bond in HisF-C1 [corresponding to (beta alpha)(5-6)] also resulted in the formation of a stable tetramer. The fusion of two identical HisF-N1 quarter-barrels yielded the stable dimeric half-barrel HisF-N1N1. Our findings suggest a two-step evolutionary pathway in which a HisF-N1-like predecessor was duplicated and fused twice to yield HisF. Most likely, the (beta alpha)(2) quarter-barrel and (beta alpha)(4) half-barrel intermediates on this pathway were stabilised by disulfide bonds that became dispensable upon consolidation of the (beta alpha)(8)-barrel.


Subjects
Aminohydrolases/chemistry , Aminohydrolases/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Molecular Evolution , Protein Folding , Thermotoga maritima/enzymology , Aminohydrolases/metabolism , Bacterial Proteins/metabolism , Computational Biology , Disulfides , Molecular Models , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism
15.
BMC Genomics ; 10: 534, 2009 Nov 17.
Article in English | MEDLINE | ID: mdl-19922605

ABSTRACT

BACKGROUND: Tunicates represent a key metazoan group as the sister-group of vertebrates within chordates. The six complete mitochondrial genomes available so far for tunicates have revealed distinctive features. Extensive gene rearrangements and particularly high evolutionary rates have been evidenced with regard to other chordates. This peculiar evolutionary dynamics has hampered the reconstruction of tunicate phylogenetic relationships within chordates based on mitogenomic data. RESULTS: In order to further understand the atypical evolutionary dynamics of the mitochondrial genome of tunicates, we determined the complete sequence of the solitary ascidian Herdmania momus. This genome from a stolidobranch ascidian presents the typical tunicate gene content with 13 protein-coding genes, 2 rRNAs and 24 tRNAs which are all encoded on the same strand. However, it also presents a novel gene arrangement, highlighting the extreme plasticity of gene order observed in tunicate mitochondrial genomes. Probabilistic phylogenetic inferences were conducted on the concatenation of the 13 mitochondrial protein-coding genes from representatives of major metazoan phyla. We show that whereas standard homogeneous amino acid models support an artefactual sister position of tunicates relative to all other bilaterians, the CAT and CAT+BP site- and time-heterogeneous mixture models place tunicates as the sister-group of vertebrates within monophyletic chordates. Moreover, the reference phylogeny indicates that tunicate mitochondrial genomes have experienced a drastic acceleration in their evolutionary rate that equally affects protein-coding and ribosomal-RNA genes. CONCLUSION: This is the first mitogenomic study supporting the new chordate phylogeny revealed by recent phylogenomic analyses. It illustrates the beneficial effects of an increased taxon sampling coupled with the use of more realistic amino acid substitution models for the reconstruction of animal phylogeny.


Subjects
Mitochondrial Genome/genetics , Genomics , Phylogeny , Urochordata/genetics , Animals , Base Sequence , Mitochondrial DNA/genetics , Molecular Evolution , Gene Order , Molecular Sequence Data , Open Reading Frames/genetics , Ribosomal RNA/genetics , Transfer RNA/genetics , Urochordata/cytology
16.
Bioinformatics ; 25(17): 2286-8, 2009 Sep 01.
Article in English | MEDLINE | ID: mdl-19535536

ABSTRACT

MOTIVATION: A variety of probabilistic models describing the evolution of DNA or protein sequences have been proposed for phylogenetic reconstruction or for molecular dating. However, a common implementation that allows these independent features to be freely combined, so as to test their ability to jointly improve phylogenetic and dating accuracy, is still lacking. RESULTS: We propose a software package, PhyloBayes 3, which can be used for conducting Bayesian phylogenetic reconstruction and molecular dating analyses, using a large variety of amino acid replacement and nucleotide substitution models, including empirical mixtures or non-parametric models, as well as alternative clock relaxation processes.


Subjects
Bayes Theorem , Computational Biology/methods , Phylogeny , Software , Animals , Nucleic Acid Databases , Predictive Value of Tests , Reproducibility of Results , Time Factors
17.
Nature ; 456(7224): 942-5, 2008 Dec 18.
Article in English | MEDLINE | ID: mdl-19037246

ABSTRACT

Fossils of organisms dating from the origin and diversification of cellular life are scant and difficult to interpret; for this reason, alternative means to investigate the ecology of the last universal common ancestor (LUCA) and of the ancestors of the three domains of life are of great scientific value. It was recently recognized that the effects of temperature on ancestral organisms left 'genetic footprints' that could be uncovered in extant genomes. Accordingly, analyses of resurrected proteins predicted that the bacterial ancestor was thermophilic and that Bacteria subsequently adapted to lower temperatures. As the archaeal ancestor is also thought to have been thermophilic, the LUCA was parsimoniously inferred as thermophilic too. However, an analysis of ribosomal RNAs supported the hypothesis of a non-hyperthermophilic LUCA. Here we show that both rRNA and protein sequences analysed with advanced, realistic models of molecular evolution provide independent support for two environmental-temperature-related phases during the evolutionary history of the tree of life. In the first period, thermotolerance increased from a mesophilic LUCA to thermophilic ancestors of Bacteria and of Archaea-Eukaryota; in the second period, it decreased. Therefore, the two lineages descending from the LUCA and leading to the ancestors of Bacteria and Archaea-Eukaryota convergently adapted to high temperatures, possibly in response to a climate change of the early Earth, and/or aided by the transition from an RNA genome in the LUCA to organisms with more thermostable DNA genomes. This analysis unifies apparently contradictory results into a coherent depiction of the evolution of an ecological trait over the entire tree of life.


Subjects
Physiological Adaptation/physiology , Archaea/physiology , Hot Temperature , Physiological Adaptation/genetics , Archaea/genetics , Molecular Evolution , rRNA Genes/genetics , Phylogeny
18.
Mol Biol Evol ; 25(5): 842-58, 2008 May.
Article in English | MEDLINE | ID: mdl-18234708

ABSTRACT

We combined the category (CAT) mixture model (Lartillot N, Philippe H. 2004) and the nonstationary break point (BP) model (Blanquart S, Lartillot N. 2006) into a new model, CAT-BP, accounting for variations of the evolutionary process both along the sequence and across lineages. As in CAT, the model implements a mixture of distinct Markovian processes of substitution distributed among sites, thus accommodating site-specific selective constraints induced by protein structure and function. Furthermore, as in BP, these processes are nonstationary, and their equilibrium frequencies are allowed to change along lineages in a correlated way, through discrete shifts in global amino acid composition distributed along the phylogenetic tree. We implemented the CAT-BP model in a Bayesian Markov Chain Monte Carlo framework and compared its predictions with those of 3 simpler models, BP, CAT, and the site- and time-homogeneous general time-reversible (GTR) model, on a concatenation of 4 mitochondrial proteins of 20 arthropod species. In contrast to GTR, BP, and CAT, which all display a phylogenetic reconstruction artifact positioning the bees Apis mellifera and Melipona bicolor among chelicerates, the CAT-BP model is able to recover the monophyly of insects. Using posterior predictive tests, we further show that the CAT-BP combination yields better anticipations of site- and taxon-specific amino acid frequencies and that it better accounts for the homoplasies that are responsible for the artifact. Altogether, our results show that the joint modeling of heterogeneities across sites and along time results in a synergistic improvement of the phylogenetic inference, indicating that it is essential to disentangle the combined effects of both sources of heterogeneity, in order to overcome systematic errors in protein phylogenetic analyses.


Subjects
Amino Acid Substitution , Arthropods/genetics , Evolution, Molecular , Mitochondrial Proteins/genetics , Models, Genetic , Amino Acid Sequence , Animals , Arthropods/chemistry , Mitochondrial Proteins/chemistry , Monte Carlo Method , Phylogeny , Time
19.
BMC Evol Biol ; 7: 146, 2007 Aug 28.
Article in English | MEDLINE | ID: mdl-17725830

ABSTRACT

BACKGROUND: Chaetognaths, or arrow worms, are small, marine, bilaterally symmetrical metazoans. The objective of this study was to analyse ribosomal protein (RP) coding sequences from a published collection of expressed sequence tags (ESTs) from a chaetognath (Spadella cephaloptera) and to use them in phylogenetic studies. RESULTS: This analysis has allowed us to determine the complete primary structures of 23 out of 32 RPs from the small ribosomal subunit (SSU) and 32 out of 47 RPs from the large ribosomal subunit (LSU). Ten proteins are partially determined and 14 proteins are missing. Phylogenetic analyses of concatenated RPs from six animals (chaetognath, echinoderm, mammal, insect, mollusc and sponge) and one fungal taxon do not resolve the chaetognath phylogenetic position, although each mega-sequence comprises approximately 5,000 amino acid residues. This is probably due to the extremely biased base composition and to the high evolutionary rates in chaetognaths. However, the analysis of chaetognath RP genes revealed three features unique in the animal kingdom. First, whereas in animals one RP generally appears to have a single type of mRNA, two or more genes are generally transcribed for one RP type in chaetognaths. Second, cDNAs with complete 5'-ends encoding a given protein sequence can be divided into two sub-groups according to a short region in their 5'-ends: two novel and highly conserved elements have been identified (5'-TAATTGAGTAGTTT-3' and 5'-TATTAAGTACTAC-3'), which could correspond to different transcription factor binding sites on paralogous RP genes. Third, the overall number of deduced paralogous RPs is very high compared to those published for other animals. CONCLUSION: These results suggest that in chaetognaths the deleterious effects of the presence of paralogous RPs, such as apoptosis or cancer, are avoided, and also that in each protein family some of the members could have tissue-specific and extra-ribosomal functions. These results are congruent with the hypotheses of an allopolyploid origin of this phylum and of ribosome heterogeneity.


Subjects
Invertebrates/genetics , Protein Biosynthesis , Ribosomal Proteins/genetics , Amino Acid Sequence , Animals , DNA, Complementary , Evolution, Molecular , Expressed Sequence Tags , Invertebrates/classification , Phylogeny , Protein Isoforms/genetics , Sequence Alignment
20.
Mol Biol Evol ; 23(11): 2058-71, 2006 Nov.
Article in English | MEDLINE | ID: mdl-16931538

ABSTRACT

Variations in nucleotide composition affect phylogenetic inference conducted under stationary models of evolution. In particular, they may cause unrelated taxa sharing similar base composition to be grouped together in the resulting phylogeny. To address this problem, we developed a nonstationary and nonhomogeneous model accounting for compositional biases. Unlike previous nonstationary models, which are branchwise, that is, assume that base composition only changes at the nodes of the tree, in our model, the process of compositional drift is totally uncoupled from the speciation events. In addition, the total number of events of compositional drift distributed across the tree is directly inferred from the data. We implemented the method in a Bayesian framework, relying on Markov Chain Monte Carlo algorithms, and applied it to several nucleotide data sets. In most cases, the stationarity assumption was rejected in favor of our nonstationary model. In addition, we show that our method is able to resolve a well-known artifact. By Bayes factor evaluation, we compared our model with 2 previously developed nonstationary models. We show that the coupling between speciations and compositional shifts inherent to branchwise models may lead to an overparameterization, resulting in a poorer fit. In some cases, this leads to incorrect conclusions concerning the nature of the compositional biases. In contrast, our compound model more flexibly adapts its effective number of parameters to the data sets under investigation. Altogether, our results show that accounting for nonstationary sequence evolution may require more elaborate and more flexible models than those currently used.


Subjects
Bayes Theorem , Evolution, Molecular , Models, Genetic , Models, Theoretical , Stochastic Processes , Animals , Computer Simulation , Humans , Monte Carlo Method , Phylogeny , Sequence Analysis